Exploiting both local and global constraints for multi-span statistical language modeling
نویسنده
چکیده
A new framework is proposed to integrate the various constraints, both local and global, that are present in the language. Local constraints are captured via ngram language modeling, while global constraints are taken into account through the use of latent semantic analysis. An integrative formulation is derived for the combination of these two paradigms, resulting in several families of multi-span language models for large vocabulary speech recognition. Because of the inherent complementarity in the two types of constraints, the performance of the integrated language models, as measured by perplexity, compares favorably with the corresponding n-gram performance.
منابع مشابه
Multi-Span statistical language modeling for large vocabulary speech recognition
The goal of multi-span language modeling is to integrate the various constraints, both local and global, that are present in the language. In this paper, local constraints are captured via the usual n-gram approach, while global constraints are taken into account through the use of latent semantic analysis. An integrative formulation is derived for the combination of these two paradigms, result...
متن کاملOF THE IEEE , AUGUST 2000 1 Exploiting Latent Semantic Information in Statistical Language Modeling Jerome
| Statistical language models used in large vocabulary speech recognition must properly encapsulate the various constraints, both local and global, present in the language. While local constraints are readily captured through n-gram modeling, global constraints, such as long-term semantic dependencies, have been more diÆcult to handle within a data-driven formalism. This paper focuses on the us...
متن کاملContext scope selection in multi-Span statistical language modeling
A multi-span framework was recently proposed to integrate the various constraints, both local and global, that are present in the language. In this approach, local constraints are captured via n-gram language modeling, while global constraints are taken into account through the use of latent semantic analysis. The complementarity between these two paradigms translates into improved modeling per...
متن کاملSpeech recognition experiments using multi-span statistical language models
A multi-span framework was recently proposed to integrate the various constraints, both local and global, that are present in the language. In this approach, local constraints are captured via n-gram language modeling, while global constraints are taken into account through the use of latent semantic analysis. The performance of the resulting multispan language models, as measured by perplexity...
متن کاملLarge vocabulary speech recognition with multispan statistical language models
Multispan language modeling refers to the integration of the various constraints, both local and global, present in the language. It was recently proposed to capture global constraints through the use of latent semantic analysis, while taking local constraints into account via the usual n-gram approach. This has led to several families of data-driven, multispan language models for large vocabul...
متن کامل